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Abstract 

Factor analysis refers to a statistical model in which observed variables are condition- 
ally independent given fewer hidden variables, known as factors, and all the random 
variables follow a multivariate normal distribution. The parameter space of a factor 
analysis model is a subset of the cone of positive definite matrices. This parameter 
space is studied from the perspective of computational algebraic geometry. Grobner 
bases and resultants are applied to compute the ideal of all polynomial functions 
that vanish on the parameter space. These polynomials, known as model invariants, 
arise from rank conditions on a symmetric matrix under elimination of the diagonal 
entries of the matrix. Besides revealing the geometry of the factor analysis model, 
the model invariants also furnish useful statistics for testing goodness-of-fit. 

1 Introduction 

In factor analysis, correlated continuous variables are modeled as conditionally indepen- 
dent given hidden (latent) variables that are termed factors. Sometimes factor analysis 
serves as a tool for dimension-reduction; the possibly many observed variables are sum- 
marized by fewer factors. However, in many applications the focus is on interpreting 
the factors as unobservable theorized concepts. In fact, the desire to explain observed 
correlations between individuals' exam performances by the co ncept of intelligence wa s 
the driving force in the original development of factor analysis ( Spearman . 1904 . 19271 ). 

Currently, statistical inference in factor analysis is often based exclusively on para- 
metric representations and on max imum likelihood estimate s computed in iterative pro- 
cedures such as the EM algorithm (jRubin and Thaverl . I1982T ) . In the early days of factor 
analysis, however, much attention was directed to model invariants, that is, to polyno- 
mial equality relations that the mo del impo s es on the entries of the covariance matrix of 
the observed variables. We refer to Harman ( 19761 . Sect. 1) for so me history. It should be 
noted that factor ana lysis also leads to inequality constraints (e.g. iBekker and de Leeuwl . 
1987 : Harman . 19761 . p. 117), which we do not address here. The best known invariants 
are the tetrads, also called tetrad differences, which arise in one-factor models. The 
name tetrad reflects that the polynomial arises in one-factor analysis with four observed 
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variables. For example, if = (tpij) € 



o4x4 



is a covariance matrix in the parameter 



space of a one-factor analysis model, then there are, up to sign change, three tetrads, 
namely 

^12^34 - ^14^23, ^14^23 ~ ^13^24, ^12^34 ~ ^13^24, (1) 

and all three tetrads evaluate to zero. Tetrads have played a major role throughout the 
history of factor analys is. They also appear in recent research, for example, in work 
on model identifiability rtGrzebvk et all 2004') and on d ichotomized Gaussian models for 
multivariate binary variables Jcox and wJ B . While tetrads are ubiquitous in 
the literature, there has been very li ttle work atte mpting to find invariants of models with 
more than one factor. The work by iKellevi (|1935l ) who derived the pentad, a fifth degree 
polynomial vanishing over covariance matrices from two-factor models, constitutes the 
exception. Since then virtually no p rogress h a s bee n made towards determining the 
invariants of factor analysis models. Harman ( 19761 . p. 77) summarizes the state of 
knowledge as follows: "When the number of factors is greater than two the work of 
computing determinants of the fourth or higher order becomes so laborious that no explicit 
conditions corresponding to the tetrads or pentad criterion have been worked out. " 

Computational difficulties aside, the apparent ease of data analysis using solely the 
parametric model representation has inhibited progress on the determination of higher- 
order invariants. However, parametric approaches are not without their problems. On 
one hand , the likelihood function of a factor analysis model may have multiple local 
maxima ([Rubin and Thaverl . Il982l ). rendering its maximization difficult. On the other 
hand, the use of information criteria suc h as BIC in explora tory factor analysis is com- 
plicated by the presence of singularities (jGeig er et allEoOlh . We believe that a better 
mathematical understanding of factor analysis mode l s will be helpful in addressing these 
issues. One step in this direction is the work of lEilisI (|2004l ) who applied algebraic topol- 
ogy to study singularities arising in factor analysis. Our interest lies in the algebraic 
geometry that expresses itself in the model invariants. This has a pragmatic side because 
the invariants can serve as useful statistics for testing model fit and for constraint-based 
model selection. Both the desire to find new test statistics as well as the wish for an 
understanding of the geometry of factor analysis constitute the motivation for this paper. 

The paper is outlined as follows. We begin with a review of the factor analysis model 
(Section [2]) and discuss the use of invariants as test statistics (Section [3]). In Section U] 
we place the problem of determining; invariants in the framework of algebraic statistics 
( Pachter and Sturmfelsl . 2005 ; Pistone et al. . 2001 ). In fact, this is one of the first studies 
in algebraic statistics which deals with continuous rather than discrete random variables. 

We shall see in Section [5] that the tetrads form a Grobner basis of the ideal of 
invariants of a one-factor model. This implies that all invariants of one-factor models 
can be written a s polynomial combin ations of tetrads, which is claimed correctly but 
without proof in iGlvmour et al.l (|1987l . p. 84). For models with two or more factors, 
we performed extensive Grobner basis computations, using Macaulay 2 and Singular, 
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Figure 1: Graphical representation of the factor analysis model Fs^. 



for factor analysis models with up to nine observed variables and up to five factors. 
In Section [6l we show that multilinear resultants provide a useful method of finding 
individual invariants even when the whole ideal of invariants cannot be determined. 

Our computational experiments lead to a series of conjectures and problems pre- 
sented in Section [71 In particular, we conjecture for two- factor models that 3 x 3-minors 
and pentads generate the ideal of invariants. For models with arbitrarily many factors we 
conjecture that the ideal is generated by polynomials arising from consideration of sub- 
matrices whose size depends on the number of factors but not on the number of observed 
variables. We believe that these conjectures are of independent interest for commutative 
algebra. In Section [8] we propose some future research directions of statistical interest. 



2 Factor analysis 

Factor analysis concerns a Gaussian hidden variable model with p observed variables 
Xi, where i G \p] = {1, . . . ,p}, and m hidden variables Yj, where j E [m] = {1, . . . , m}. 
It is assumed that (X, Y) follows a joint multivariate normal distribution with positive 
definite covariance matrix. The factor analysis model F Pjm is defined by the requirement 
that the observed variables Xi, i E [p], are conditionally independent given the hidden 
variables Yj, j G [m]. T he factor analys is model F P)m can be visualized using the 
graphical model formalism ( Lauritzenl . 19961 ). in which the dependence structure between 



observed and hidden variables is encoded by an acyclic directed graph. This directed 
graph has the vertex set {X%, . . . , X p , Y\, . . . , Y m }, and the edges are Yj — ► Xi for all 
j G [m] and i G \p], as shown in Figure [1] for m = 2 and p = 5. 

We start out by deriving the following parametric representation of our model. 

Proposition 1. The factor analysis model F Pi7n is the family of multivariate normal 
distributions N P ((J,, ^) on MP whose mean vector \x is an arbitrary vector in MP and 
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whose covariance matrix lies in the (non-convex) cone 

F P:m = {£ + AA* G R pxp : S > diagonal, A G M pxrra } 2 
= { X + r G M pxp : £ > diagonal, T > symmetric, rank(T) < m}. 

Here the notation ^4 > means that ^4 is a positive definite matrix (i.e., all eigenval- 
ues are positive), and similarly ^4 > means that A is a positive semidefinite matrix. 

Proof. Consider the joint covaricince matrix of the fully observed, model underlying Fp m , 

- C £)■ w 

The entries of this matrix are constrained by the conditional independence statements 
XilLXjUY^Yz,...^} (l<i<j<p), (4) 

which translate into the vanishing of the corresponding (m + 1) x (m + l)-determinants: 

det (l| %*) = ^-detW - A«.adj(cD).A*, = 0. (5) 

We refer to Matus <|2005h for a general discussion on how to translate conditional inde- 
pendence statements for Gaussian random variables into polynomial algebra. 

The determinantal constraint ([5]) allows us to block-diagonalize the positive definite 
matrix ([3]) as follows: 



E 0\ fl p -K<5>- 1 \ A\ / I p 



$) VO I m U* $7 V-S^A* Jm 



(6) 



Upon multiplication by det($) > 0, the entry of the matrix S = *S — A • <3? _1 • A* in row 
i and column j is equal to ©, so this positive definite matrix is diagonal if and only if 
X satisfies the model F PiJn . This holds if and only if its covariance matrix \& has the 
form = S + A • <3?~ 1 • A* if and only if ^ is in the cone F Pim . □ 

In what follows we generally identify the factor analysis model F PiJn with its param- 
eter space Fp m . The description given in Proposition [Tl shows that F pm is a parametri- 

cally presented subset of the space r( 2 ) of symmetric p x p-matrices. The dimension 
d = dim(i ? P)m ) of the model F Pim is the maximal rank of the Jacobian matrix of that 
parametrization. The codimension of F Pjm is ( p ^ 1 ) — d. 

Theorem 2. The dimension and the codimension of the factor analysis model are 
dim(F P)m ) = min j p(m + 1) - ( ™ J , ( P ^ 1 
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codim(i ? p m ) = max <. f ^ J — m, 
Thus the codimension of the factor analysis model is positive if and only if 



P — 



1 , 1 

m H — V8m + 1 H — 



+ 1. 



(7) 



Proof. Using orthogonal transformations as in the QR-decomposition, every £ F Pt 



p,m 



can be written as = S + AA with A = (Ajj) being lower-triangular in the sense that 



A g L 



p,m 



{A G M p> 



A; 



for all 1 < i < j < m} ; 



see also Anderson and Rubinl (1956, pp. 121 and 124). Thus the factor analysis model 



Fp jTn is the image of the following polynomial map: 



P m v T 
>0 ^PjTn 



r( P 2 1 ) 5 (E, A) 
The coordinates of the parametrization (jSJ) are 



S + AA*. 



(8) 



era + 22 



min(i,m) >2 



r=l 



min(i,m) 



Emin 



A^r Aj> 



At if « = j, 
if i < j. 



The dimension of the domain and the image space of ([8]) are p(m+ 1) — (™) and ( p ^ 1 ) 
respectively, so the minimum of these two numbers is an upper bound for F pm . To 
prove that this upper bound is tight, we will show that the Jacobian matrix J of the 
parametrization has full rank almost everywhere. The Jacobian matrix has the form 



J 



a A 

I P B 
A 



The entries in the unit matrix I p on the upper left are 



da t . 



1 if t = 
else. 



(9) 



The matrix J has full rank if and only if the (S) x (pm — C%)) -matrix A has full rank. 
The entries of the latter matrix A are 



dip. 



Kl 



si 



X tt if i < j and (i,j) = (t,s), 

Xjt if i = s = t < j or t < i = s < j, 

X it if t < i < j = s, 

else. 



(10) 
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If we set Vi< = (Yv+i> • • • > ^hpf G ^ 1 and 
matrix A can be written in the following form 



(Xj+ij, XpjY G then the 
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zero. 


The submatrices Ajj = 


Ajj Ip—i 


are diagonal, and, for i > 






Aj+ij 
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\i+2,j 



V 



A 



Some of the submatrices in the partition of A may not be present if p is too small. For 
example, the entire lower half of A is not present if p < m + 1. If p < m + l, then A has 
a lower-triangular structure and is clearly of full rank if all An are non-zero. So we will 
assume that p > m + 2, in which case the lower half of A comprises ( p ~^ n ) rows. 

We will now choose a particular matrix A for which A = A(A°) is of full rank. 
The existence of such A implies that the rank of A is full for almost every choice of A. 
The matrix A has entries in {0, 1} with the non-zero entries chosen as follows. For all 
i G [m], we set A^ = 1. As a consequence, the upper right block of A is of full rank 
Y^iLi (p ~ *) = P m ~ i" 1 ^ 1 ) ■ ^he remaining non-zero entries of A are determined as 
follows. Let J(p,m) be the minimum of m and ( p ~^ n )- For j G [J(p, m)], let be the 
integer in {m + 1, . . . ,p — 1} such that the j'-th row of the lower half of A is indexed by 
ipitj) : t with t > + 1. For j = 1, . . . , J(p,m), we set exactly two components of the 
vector A>j equal to one, namely those appearing in that row of A^ j that is part of the 
j-th row of the lower half of A. As examples, consider (p, m) = (7, 4) and (p, m) = (8, 4), 
for which the above procedure selects the two (transposed) matrices 



(Ao)' 



/l 1 1 0\ 

10 10 1 

1 1 1 

\0 1 0/ 



and (A )* 



/l 1 1 0\ 

10 10 10 

1 1 1 

\o 1 1 1 0/ 
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Since the matrix A has entries in {0, 1}, the same holds for the matrix ^4°. A 
submatrix A^- of the upper right block of ^4°, that is, 1 < j < i < m, has only one 
column with non-zero entries because A^- = if i < m and i ^ j. This non-zero column 
is indexed by A^- and can thus be eliminated by subtracting the row of ^4° indexed by 
ipji. This way the upper right block of A is transformed into a unit matrix of size 
YliLlip ~ i) = P m ~ C"^ 1 )' wriUe no fill-in occurs in the upper left block of A . 

Next we eliminate the lower right block of ^4° by subtraction of rows from the upper 
half. This elimination creates fill-in in the lower left block of ^4°. This fill-in is zero 
except for J(p, m) many entries that are all equal to —2. These non-zero entries occur 
in the positions j = 1, • • • , J(p,m), within the lower left block of A . It follows 

that the rank of ^4° is equal to 



pm 



m + 1 
2 



+ J(p, m) = min < pm 



which is the minimum of the number of rows and columns of ^4°. Hence, ^4° is of full rank 
as we had claimed. This concludes the proof of the stated formula for dim(F p ^ m ). The 
codimension is minus dim(F Pjm ), and inequality ([7]) is gotten by solving { p ~^ n ) > m 



for p. 



□ 



Example 3. Let us consider the case of two factors m = 2. The model F p 2 has positive 
codimension if and only if p > 5. For p = 5, Theorem [2] says that ^5,2 has codimension 
1, so it is a hypersurface in the space of symmetric 5 x 5-matrices. The hypersurface is 
defined by the polynomial 



V'l2V'l3''/'24^35^45 ~ 1pl2'4>l3lp25'4>Ulp45 ~ ^12^14^23^35^45 + i>12'lpl4'4>25lpM'4>35 
+^1 2 V>15V'23V'34V'45 - ^12^15^24^34^35 + ^13^14^23^25^45 ~ ^13^14^24 ^25^35 
-^l^l^^i^h + V'13V'15V'24V'25V'34 ~ ^lA^lb^2^2b^M + ^l^lb^1p2i^ ■ 



This is the pentad constraint which was first derived by iKellevI (|l935l ). If * is the 
covariance matrix of a distribution in the model Fs 5 2 then f(^f) = 0, and the pentad 
/ is the unique irreducible polynomial (up to scalar multiplication) with this property. 
In the next section we discuss the use of such invariants as test statistics, and in the 
subsequent sections we derive higher invariants using methods of computational algebra. 

For p = 4, Theorem [2] says that F4 2 has codimension 0, so it is full-dimensional in the 
space of symmetric 4x4-matrices. The theorem does not state that every positive definite 
matrix \& is in the model F&2- All it states is that the decomposition of Proposition [H 
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imposes no equality constraints on the covariance matrix But it does impose con- 
straints in the form of inequations 7^ and inequalities > 0. We will discuss 
this issue in Section 4, after the algebraic set-up of ideals has been introduced. Note that 
the statistical problem of parameter identification corresponds to the algebraic problem 
of solving the equations (fTTj) for the unknowns an, Ajj when the ipy are given. 



3 Invariants as test statistics 

Let G MP xp be a covariance matrix, that is, a positive definite symmetric p x p-matrix, 
and let / be a polynomial in the entries ipij of \&. We write f(^f) for the evaluation 
of / using the numerical values of a particular matrix The polynomial / is called 
an invariant of the factor analysis model F PtTn if /(^) = for all matrices \P in the 
parameter space F p>m . Classical examples of invariants are the tetrad and pentad. If f 
is an invariant of F p>m and ^ is a covariance matrix such that f(*&) 7^ then we can 
deduce that ^ F Pitn . This suggests that model invariants can be used as statistics in 
tests of model fit. We propose the following approach for putting this on a sound basis. 

Assume we observe a sample X\, X2, ■ ■ ■ , X^ of independent random vectors in 
W that are identically distributed according to the multivariate normal distribution 
A/" p (/i, \£) with mean vector [i £ MP and positive definite p x p-covariance matrix Let 
X = jj Y2k=l Xk be the sample mean vector and consider the sample covariance matrix 

1 N 

s = ( s v) = j^^Xk-x^Xk-xy. 

k=l 

Moreover, let / be an invariant of a hypothesized factor analysis model F pm . The sample 
invariant f(S) provides a consistent estimator of the true invariant evaluation f(^f). 
The variance of f(S), which we denote by Vax^[/(5)], can be derived by computing 
appropriate moments of the Wishart distribution according to which the matrix (N — 
1) • S is distributed; compare [Mardia et al.1 (| 19791 . Sect. 3.4) and lWishartl (|l928al ). The 



variance Var^[/(S')] is a polynomial function of the true covariance matrix ^. Replacing 
^ by the sample covariance matrix S in this polynomial yields the estimator Vars[/(,f>)]. 
Using this estimator, we can define the standardized sample invariant 

Zf = /(>9) , (12) 

Proposition 4. Let f be an invariant of the model F Ptm , and let ^ be a covariance 
matrix such that f vanishes at \& but its gradient vector V/ does not vanish at Then, 
as the sample size N tends to infinity, the standardized sample invariant Zf converges 
in distribution to a standard normal distribution: 

Z f ^ d AA(0,1). 
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Proof. The vectorization of y/N(S — \&) converges in dis tribution to a centered multi- 
variate normal distribution. Hence, by the delta method ( Shorack . 2000l . p. 279), 



'Nf(S) = VN[f(S)-fm — M(0,v f ) 
with asymptotic variance 

v f = lim iWar*[/(S)] > 0. 

Since Var# [/(£)]/ Vars [/(£)] converges in probability to one, it follows from Slutsky's 
theorem that 



\/Var s [/(S)] ^JVVar.l/CS)] * V»7 " 

Remark 5. The sample invariant f(S) is typically a biased estimator of /( X I / ). However, 
if the expectation of f(S) is of the form E*[/(S)] = h(N) ■ /(*), where h(N) is a 
function of the sample size only, then one can consider the bias- corrected sample invariant 
f(S) = f(S)/h(N). An analog of Proposition [4] holds when f(S) is replaced by f(S). 

Example 6. We derive the standardized and bias-corrected sample invariants for the 
one-factor model F P) i. Let i, j, k, i be four distinct indices in \p] and consider the tetrad 

/ = ^Pik^je - ipieipjk- (13) 

If ^ is a covariance matrix in F p \ then the tetrad vanishes, i.e., f(*S>) = 0. 

The sample tetrad f(S) = SikSjg — s^Sjk is a consistent but biased estimator of f(^). 
However, the bias can be corrected as described in Remark [5] with the bias- corrected 
sample tetrad being equal to 

N- 1 

f(S) = — - - (s ik Sj£ - SifSjk) ■ 
For any covariance matrix ^f, the variance of this unbiased estimator of f(^f) is equal to 
Var* [f(S)] = (Ar _^|^ _ 2) det(% j}x{i)i} ) • det(tf {M}x{M} ) ^ 

This expression was first computed by IWishartJ (|l928bl ). If * G F p>1 , then 
det{^{i,j}x{k,e}) = ^ik^ji - ipie^Pjk = 0. 



2 
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Thus the last term in (|14p vanishes and we can use the estimate 



Var 5 [/>)] = 

_ fjnj _ 2 ) det (%j}x{*j}) " det ( s {k,e}x{k,e}) - jj—^ det ( s {i,i,k,e}x{i,j,k,e})- 



Following the recipe in (|12p . we introduce the standardized bias-corrected sample tetrad 

f(S) 



Var 5 [f(S)] 



Z f = 

This is an explicit expression which can be evaluated for any sample covariance matrix S 
arising from data X{. If at least one of the four entries ip^i ^jt-, ^Ui ^jk is non-zero, then 
the gradient of the tetrad is non-zero at ^f. Proposition|4]says that Zj has an asymptotic 
standard normal distribution N(0, 1) when the sample size N tends to infinity. 

Suppose now that / is an arbitrary polynomial invariant of the factor analysis model 
F Pjm , and we wish to test the null hypothesis Ht : f(^) = 0. In light of Proposition 
HI we can do this by computing the corresponding standardized sample invariant Zf 
and by comparing it to the appropriate quantile of the standard normal distribution 
AA(0, 1). More precisely, for chosen significance level a G (0, 1), we can find an interval 
[— c a , Cq.] which, assuming Hf is true, contains the standardized sample invariant Zf 
with (asymptotic) probability 1 — a. If we observe a value Zf of Zf that falls outside 
this interval, Zf [— c a ,c a ], then this constitutes evidence against Hf and, in particular, 
evidence against the hypothesized factor analysis model of which / is an invariant. 

If the hypothesized factor analysis model is a hypersurface then consideration of a 
single invariant / is sufficient. This happens, for instance, in the case p = 5, m = 2 
discussed in Example El Here / is the pentad and we only need to test Hf. 

In general, however, the model structure will not be captured in a single polynomial 
invariant. Then we might want to employ a set / of invariants of the considered model to 
test model fit. For instance, / could be a set of ideal generators as in Section 4. A simple 
approach to working with several invariants is to employ Bonferroni's inequality, which 
suggests the consideration of the interval [— c a /\n,c a /\j\]. This interval simultaneously 
contains all standardized sample invariants Zf, f £ I, with probability at least 1 — a. 
Therefore, if one or more observed values Zf fall outside the interval [— c a /m, c a /iji], then 
we have found statistical evidence against the hypothesized factor analysis model. More 
powerful approaches than this simple Bonf erroni method can be ob tained by combining 
the invariants in a qua dratic form; s e e e.g. IffipD and Botol (|2003h for work employing 



tetrads. Alternatively, Spirtes et al. (2000) employ tests of vanishing tetrads to define 



scores for model selection in Gaussian graphical models with hidden variables. 

Tetrads appear to be the only invariants of factor analysis models that have seen 
routine use in data analysis. However, the approach we have outlined above is feasible 
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also for other invariants such as the pentad and the higher invariants we determine sub- 
sequently. The only difficulty involved is the estimation of the variance-covariance struc- 
ture of the s ample invariant s . We expect that recen t work on moments of the Wishart 
distribution ( Graczvk et al. . 20051 ; Lu and Richards . 200ll ) can be applied fruitfully to 
overcome this difficulty. When moments of invariants cannot be determined exactly, 
asymptotic approximations can be derived from the asymptotic c ovariance matrix for 
the sample covariance matrix; see iRoverato and Whittaker (Il998h for a discussion of 
properties of the Isserlis matrix which determines this asymptotic covariance matrix. 



4 Algebraic setup 

We are interested in polynomial relations among the entries of a factor analysis co- 
variance matrix Vl/ G F pm . The mathematical framework for studying such polynomial 
relat ions is that of commutative algebra and algebraic geometry (see e.g. I Cox et al 



19971 ). Many algorithms from these fields are implemented in software for symbolic 
computation, and they provide powerful computational tools for the study of model in- 
yariants. The application of these tools in s tatist ics is the focus of algebraic statistics 



(Pac hter and Sturmfelsl . 1200.4 IPistone et all . l200lh . While algebraic statistics has so far 



been predominantly occupied with the study of models for discrete random variables, 
the present study is one of the first in this emerging field which concerns continuous 
random variables. The set-up to be introduced is fairly general and can be used to 
study arbitrary Gaussian graphical models, not just the factor analysis model. 

We fix the ring of polynomials with real coefficients in the (f^ 1 ) indeterminates ipij: 



t[lpij,i<j] = M[V'll,^12,---,V ; lp,V'22,---,V'2p,V ; 33,---,V ; ; 



ppj 



For any subset F of the symmetric matrices in R pxp , let 1(F) be the set of all polynomials 
/ G R[ipij, i < j] such that /(*) = for all Clearly, 1(F) is an ideal in R[ipij, i < 

j]; that is, the sum of two polynomials in 1(F) is again in 1(F), and the product of any 
polynomial in M.[ipij, i < j] with a polynomial in 1(F) is in 1(F). According to Hilbert's 
basis theorem, every ideal is generated by a finite list of polynomials. We tacitly assume 
this finite representation for all the ideals which appear in the following discussion. 

The object of our interest is the ideal of invariants of the model F Pim , which is 
the ideal I PtTn = I(F PtW ). Since membership in F Ptm depends only on the off-diagonal 
entries of the matrix ^, we can regard I pm as an ideal in the subring M[ipij, i < j] 
of i < j]. If / is any ideal in the bigger polynomial ring i < j] then the 

intersection / D M.[t/jij, i < j] is an ideal in the smaller polynomial ring i < j]. 

Passing to this intersection is the process of elimination of the variables ipn, ■ ■ ■ ,ip pp . 
The following result shows how the ideal I P) m can be computed using elimination. 
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Theorem 7. Let M p-m C Rf^y, i < j] &e the ideal that is generated by all (m + 1) x 
(m + l)-minors of a symmetric matrix \P £ M pxp . T/ien i/ie idea/ 0/ invariants equals 

I P ,m = M P!m n R^ij, i < j]. (15) 

Proof. The proof makes use of standard arguments from algebraic geometry, and all 
varieties V{ ■ ) are understood over the field C of complex numbers. Recall that the 
variety V{M p ^ m ) of the ideal M p ^ m is the set of common zeroes of the polynomials in 
M P)in . This set coincides with the set of all symmetric p x p-matrices of rank at most m. 
Let Fp m denote the set of all p x p-matrices of the form ^ = S + T where £ is a diagonal 
matrix and T £ V(M p ^ m ). Thus F pm is the superset of our parameter space F Pjin gotten 
by dropping the positive definiteness requirement. Since the cone of positive definite 
matrices is open (and hence Zariski dense) in the space of all symmetric matrices, we 
conclude that F Pt , m and F' m have the same Zariski closure in C pxp . The projection of 
this Zariski closure onto the space of off-diagonal entries is Zariski closed, and it coincides 
with the variety V(I PtTn ) of the desired ideal I p<m . On the other hand, every matrix in 
the projection of F p m is the image of a matrix T in V(M P!Tn ). We conclude that V(I Ptm ) 
equals the Zariski closure of the proje ction of V(M Pim ) onto the off-diagonal coordinates. 



By the Elimination Theorem (see lCox et all 119971 . §3.2) we have 



V(M p<m n M[^ y , i < j}) = V(I p>m ). (16) 
Now, it is know n that M Pi m. is a prime ideal and that the minor s form a Grobner basis for 



M p , m ; compare &3 «, Isturmfels and Sulliva^ S, Corollary 4.11, Example 



4.12). The primality of M P)Tn implies that M PtTn n i < j] is prime as well. Since, 

by definition, I Pjm is radical, we apply Hilbert's Nullstellensatz to (fl~6|) to conclude that 
I p , m = M p , m n R[ipij, i<j\. □ 

Theorem [7] allows for the derivation of a finite generating set of the ideal I Pim using 
the method of Grobner bases. This will be explained in Section 5. In the remainder of 
this section, we discuss some consequences and geometric aspects of Theorem [71 starting 
with some polynomials that obviously belong to the ideal I P;m . Up to sign, there are 

1/ p \/2(m + i; 
2 \2(m + l)J V m + 1 

off-diagonal (m + 1) x (m + l)-minors of the matrix that is, subdeterminants that 
do not involve any diagonal entries of Such minors of size m + 1 are trivially in I p , m . 

Corollary 8. Let p > 2(m + 1) and choose two disjoint sets R,C <Z \p] of cardinality 
\R\ = \C\ = m + 1. Then the off-diagonal minor det(^ rxc) is in L P}in . 
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Example 9. Let m = 1. Then the 2 x 2-off-diagonal minors of ^ belong to I Pt \. For 
example, if R = {1, 2} and C = {3, 4} then this minor is the tetrad 



det(^ RxC ) = det ( ^ 14 ) = -013^24 - ^14^23 • 

\1p23 W2A J 

If p < 2{m + 1) there are no off-diagonal minors. However, if p > 2m + 1, it is still 
easy to determine some non-zero polynomials in I Pim by considering two minors that 
contain exactly one common diagonal entry fa and eliminating this diagonal entry. 

Corollary 10. Let m > 2 and p > 2m + 1, and choose i € \p]. Let (R,C) and (R,C) 
be two pairs of m-element subsets of [p]\{£} that are disjoint: RnC = ® = Rr\C. 
Finally, let "J/ be the symmetric matrix whose off- diagonal entries are the unknowns ipij 
and whose diagonal entries are equal to 0. Then the following polynomial is in I P) m : 

fi,R,c,R,c = det(^ x c) • det(^ a)x( . 0) ) - det^ RxC ) ■ det(^ >jR)x(iiC) ). 

The notation (i,R) x (i,C) indicates that the i-th row and column are arranged first. 

Proof. The polynomial fi : R t c,R,c nes m ^bPij> i < j]- The two identities 

det(% fl)x(i , c) ) = det(*J iii)x( . c) ) + fa ■ det(* flxC ), 
det(* (ij K)x(i >( 7)) = det(*° irS)x( . e) ) + fa ■ det(V Rx0 ) 

imply 

fi,R,c,R,c = det(ty RxC ) ■ det(* ( i^ )x(ij(? )) - det(* Sx(3 ) • det(*( i>jB ) X ( ii0 )). 

This is a polynomial linear combination of (m + 1) x (m + l)-minors of ^f, so it lies in 
M Pt m. We conclude that f i R c R q is in the right hand side of (fT5j) and hence in I P)m . □ 

The linear eliminant fi rc RC ^ s a homogeneous polynomial of degree 2m + 1. For 
m = 2 and m = 3, the linear eliminants recover the tetrads and the pentads as follows: 

Remark 11. If m = 1 and p > 4 then we can choose pairs (R, C) and (R, C) satisfying 
the assumptions of Corollary UOi The result is a polynomial combination of two tetrads: 

fi,R,c,R,c = -^rc fa fa + fas fa fa = -fa ■ det(*( rif } x { Cji }) + fa • det(^{ E)C } x { fji }). 

Example 12. Let m = 2 and p = 5. Then the polynomial fmcRc nas degree five, 
and it does not depend on the choices of i, R, C, R and C. Up to sign, it coincides with 
the pentad / which was displayed in Example [3j Note that the twelve monomials in the 
pentad / correspond to the twelve labeled cycles on the set of nodes {1, 2, 3, 4, 5}. The 
ideal I52 is the principal ideal generated by the pentad; in symbols, I^^ = (/)• 
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The following proposition shows that linear eliminants are non-redundant invariants. 

Proposition 13. Let m > 2 and p > 2m + 1. IfRL)C = RUC, then linear eliminant 
fiRCRC is n °t i- n the ideal generated by the off-diagonal (m+1) x (m+1) -minors of ^ . 

Proof. Without loss of generality assume that i = 1 and R U C = {2, 3, . . . , 2m + 1}. 
Since codim(i ? 2 m +i jm ) > 0, by Theorem [2l we can choose a symmetric matrix ^ G M. pxp 
such that (i) fi n c,R,c(^) ano - ( n ) an off-diagonal entries in row 2m+2 to p are zero. 
Then all off-diagonal (m + 1) x (m + l)-minors of the chosen matrix are zero. This 
shows that f\ rc RC cannot be a polynomial combination of the off-diagonal minors. □ 

We believe that the following converse to Proposition [13] holds. As we will see, 
Conjecture 1141 is part of a general series of finiteness conjectures about the ideals L p ^ m . 

Conjecture 14. Let m>2 and p > 2m + 2. If RUC ^ RUC then the linear eliminant 
fiRCRC * s ^ n th- e ideal generated by the off-diagonal (m+1) x (m+l)-minors of ^ . 

We call the linear eliminants where RUC = RUC the (2m + 1)- ads. So when m = 2, 
we recover the pentads and when m = 3 we obtain septads. 

We close this section with a discussion of the geometric role played by the ideal I P;m 
in the context of factor analysis. Recall that the variety V(I Ptm ) is the set of all common 
zeros of the polynomials in the ideal I Pt m- Consider the following four statements: 

(a) A polynomial vanishes on the parameter space F pm if and only if it lies in I p>m . 

(b) The factor analysis model F Pjm is represented by the variety V(I PtTn ). 

(c) The parameter space F PjTn coincides with the variety V(I Ptm ). 

(d) The closure of the parameter space F P)Tn coincides with the variety V(I p ^ m ). 

Then statement (a) is true because this is how the ideal J P:TO was defined. Statement 
(b) is vague, but it expresses the philosophy of this paper, so we simply declare it to 
be true. On the other hand, statement (c) is false with respect to every meaningful 
interpretation of what the statement may mean. In algebraic geometry, V(I Pjm ) denotes 
the set of zeros of I Pim over the field C of complex numbers, and this is what was meant in 
the proof of Theorem [3 Considering the zeros of I PtTn among positive-definite matrices, 
positive semi-definite matrices, or just real symmetric matrices, we get the inclusions 

^ / pd(^p,m) C ^ / psd(-^p,m) C Visi(I Pj m) C V(Ip ;m ). 

So, meaningful interpretations of (c) may be that F p ^ m equals V p d(L p ,m), and that V(I Ptm ) 
equals the set F p m of complex p x p-matrices ^ = S + AA* where S is diagonal and 
A G C pxm . Both of these statements are false as the following example shows. 
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Example 15. Let p = 3 and m = 1. Then F^i consists of all 3 x 3-matrices of the form 

V>12 V>22 "023 = ( cr 22 ] + [ A 2 i ) • (An A 2 i A 3 i) . (17) 
^13 V>23 ^33/ 

The three off-diagonal identities imply 

Af 1 '023 - ^12^13 = ^21^13 - ^12^23 = ~ "013^23 = 0. (18) 

This shows that a matrix in in F% 1 cannot have precisely one zero off-diagonal entry. 
But V(i3,i) consists of all symmetric 3 x 3-matrices since codim(i ? 3 j i) = and I^i = {0}. 
Hence F^ t is a proper subset of V{I^\), and, likewise, F$ t \ is a proper subset of V p a{Iz,i)- 

Let us now come to statement (d). This statement is true over the field C of complex 
numbers. Every matrix in V{I p ^ m ) is the limit of matrices in F~ m . This follows from a 
non-trivial algebraic geometry result to the effect that, for the image o f any polynomia l 



map over C, the usual closure coincides with the Zariski closure (see I Cox et al.l . 119971 . 
Proposition 7, p. 490). On the other hand, statement (d) is false over the real numbers. 
Namely, in our example, V p d(l3,i) is the set of all positive-definite matrices. If ^ is a 
positive-definite matrix with -0i 2 > 0, V13 > and ^23 < 0, then ([18]) forces An to be 
the square root of a negative number, so ^ cannot be in the closure of -£3,1- A similar 
(but more complicated) analysis can be performed for the case p = 4, m = 2 starting 
from the equations given in (jlip . 

In summary, in this paper we do not determine all the constraints satisfied by the 
parameter space F Pt7n of the factor analysis model. What we do determine is the set I p<m 
of all polynomial equation constraints. These characterize the closure of F PtTn if we allow 
complex numbers. The polynomials in I PtTn are the model invariants, and, as argued in 
Section El they can be used to derive novel test statistics for Gaussian graphical models. 

5 Grobner basis computations 

We now focus on computing finite generating sets for the ideals I pm . Following Theorem 
[71 this can be done by equating to zero all (m + 1) x (m + l)-minors of an unknown 
symmetric p x p-matrix ^, and then eliminating the off-diagonal unknowns ipa from 
these equations. In computer algebra, there are two main methods for eliminating 
unknowns from a system of equations: Grobner bases and resultants. In this section we 
present the Grobner basis approach, while resultants will be featur ed in the n e xt sec tion. 



We shall assume familiarity with "Grobner basics" at the level of ICox et al.1 (119971 ). 

The complete answer to our problem is currently only known for the one-factor model 
(m = 1). Namely, as we shall see in Theorem 1161 the tetrads provide a reduced Grobner 
basis for the ideal I Pi \. For two or more factors (m > 2), we did numerous computations 
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with the computer algebra systems Macaulay2 and Singular. The results of these 
computations are presented in this section (see Tables [T] and [2] below). We shall return 
to these results in Section [71 where we offer some conjectures about the ideals ip, m . 

Let m = 1. For any four indices i < j < k < £ in [p], we have the tetrads in ([1]). 
Since the first tetrad is the difference of the third and the second tetrad, it suffices to 
pick out the last two tetrads in dU). Let 

T p = {ipijipke ~ ipik^jt, tpieipjk ~ ipik^je \l<i<j<k<£<p} 



be the set of 2(^) tetrads obtained in this way. As described by Ide Loera et al.1 ()1995l ). 
the underlined terms are the leading terms with respect to a certain monomial order >- 
on R[ipij, i < j]. (They call this monomial order the thrackle order.) 

Theorem 16. If p < 3 the ideal I p> i is the zero ideal. If p > 4, the set T p is the reduced 
Grobner basis of the ideal I V) \ with respect to the monomial order y. 



Proof. The claim follows from Theorem 2.1 in Ide Loera et al. I (|l995h . □ 



If we observe p = 5 variables, then the set T§ contains ten tetrads. In lHarman 



( 1976L p. 76) it is stated that one can find five of these tetrads such that "any other 



condit ions must be linearly dependent on the [five tetrads]. " Similarly, iHipp and Bollen 



(2003, p. 277) state that "to detect the full set of redundant vanishing tetrads when there 



are more than four variables requires careful algebraic derivation" and "i n the case of the 



five-indicator model there will be five nonredundant vanishing tetrads. " lHarman! (|1976l . 
p. 418) outlines a justification of his claim, which, however, is valid only if all ipij are 
non-zero. In a strict algebraic sense, Harman's claim is incorrect because none of the 
ten tetrads in 7s is a polynomial linear combination of the other nine tetrads. 

Moving on to the general case m > 2, we now demonstrate how to compute a minimal 
generating set of the ideal I Pym by means of two software packages for algebraic geometry. 

Example 17 (p = 7, m = 2 in Macaulay 2 ) . The first software we used is the program 
Macaulay 2 due to Gravson and Stillman ( 19981 ) . To compute a minimal generating 



set of the ideal J72 using Macaulay 2, we use the following sequence of six commands: 

R = QQ[pll,p22,p33,p44,p55,p66,p77,pl2,pl3,pl4,pl5,pl6 ) pl7,p23,p24,p25,p26,p27, 

p34,p35,p36,p37,p45,p46,p47,p56,p57,p67, MonomialOrder=>Eliminate 7] ; 
Psi = matrix{{pll,pl2,pl3,pl4,pl5,pl6,pl7}, 

{p 12 , p22 , p23 , p24 , p25 , p26 , p27} , 

{pl3 ,p23 ,p33 ,p34,p35 ,p36 ,p37} , 

{p 14 , p24 , p34 , p44 , p45 , p46 , p47} , 

{p 15 , p25 , p35 , p45 , p55 , p56 , p57> , 

{p 16 , p26 , p36 , p46 , p56 , p66 , p67} , 

{pl7 , p27 , p37 , p47 , p57 , p67 , p77}} ; 
M72 = minors (3, Psi) ; 
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172 = ideal selectInSubring(l,gens gb M72) ; 

mingens 172 

codim 172, degree 172 

The command 172 = ideal select InSubringd ,gens gb M72) performs the actual 
elimination step of deriving I7.2 from M-jp- The command mingens 172 outputs 56 
polynomials which minimally generate the ideal -Z7.2. This list includes 35 polynomials 
of degree three and 21 polynomials of degree five. The latter are 21 pentads like 

P36p37p45p47p56-p35p37p46p47p56-p36p37p45p46p57+p35p36p46p47p57 
+P34p37p46p56p57-p34p36p47p56p57+p35p37p45p46p67-p35p36p45p47p67 
-P34p37p45p56p67+p34p35p47p56p67+p34p36p45p57p67-p34p35p46p57p67. 

Of the 35 polynomials of degree three, 21 are off-diagonal minors like 
P26p35p47-p25p36p47-p26p34p57+p24p36p57+p25p34p67-p24p35p67. 

The remaining 14 polynomials are sums of off-diagonal minors. In fact, we can replace 
these by 14 off-diagonal minors such that the resulting 35 polynomials are minimal 
generators of It 2- This validates the entry for p = 7, m = 2 in Table [2j Finally, the last 
command line informs us that the variety V{Ii2) has codimension 8 and degree 259. 

The notion of "degree" requires an explanation. Next to the codimension, this is 
the most important invariant of an algebraic variety. Suppose that V is a variety of 
codimension c in C. Then the degree of V is the number of points in V D L where L 
is a general affine subspace of dimension c in C. The case c = 1 is familiar: if V is a 
hypersurface, defined by the vanishing of one polynomial /, then the degree of V equals 
the degree of /, and this is the number of intersection points of V with a general line. 

Table [T] summarizes what we know about the codimension and the degree of the 
factor analysis model V(I pm ) for m < 5 and p < 9. In this section we discuss the m = 2 
and m = 3 columns, and in Section 6 we discuss (m,p) = (4,8) and (m,p) = (5,9). 
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Table 1: Codimensions and degrees for the factor analysis model. 
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For m = 2 our computations suggest that the ideal is always generated by minors 
and pentads, but so far we have been unable to prove this for general p. See Section 
[7] for a discussion of this conjecture. For m > 3 we found that the computer algebra 
system Singular performs better than Macaulay 2. Here is a non-trivial computation. 

Example 18 (p = 8, m = 3 in Singular). To compute the ideal is,3 using Singular 
dGreuel et all l200lh . the following sequence of eight commands can be used: 



ring R = 0, (pll,p22,p33,p44,p55,p66,p77,p88, 
pl2,p23,p34,p45,p56,p67,p78,pl8, 
pl3,p24,p35,p46,p57,p68,pl7,p28, 

pl4,p25,p36,p47,p58,pl6,p27,p38, pl5,p26,p37,p48) ,dp; 
matrix Psi[8][8] = pll,pl2,pl3,pl4,pl5,pl6,pl7,pl8, 

pl2,p22,p23,p24,p25,p26,p27,p28, 

pl3 , p23 , p33 , p34 , p35 ,p36 , p37 , p38 , 

pl4 , p24 , p34 , p44 , p45 , p46 , p47 , p48 , 

pl5 ,p25 ,p35 ,p45 ,p55 ,p56 ,p57 ,p58, 

pl6 ,p26 ,p36 ,p46 ,p56 ,p66 ,p67 ,p68, 

pl7 , p27 , p37 , p47 , p57 , p67 , p77 , p78 , 

pl8 , p28 , p38 , p48 , p58 , p68 , p78 , p88 ; 
ideal M83 = minor (Psi ,4) ; 

ideal 183 = eliminate (M83, pll*p22*p33*p44*p55*p66*p77*p88) ; 
nvars (basering) - dim(I83) ; mult(I83); // codimension and degree 
ideal I83min = mstd(I83) [2] ; // minimal generators 
betti (I83min) ; 
I83min; 

In the first command, which declares the polynomial ring, we list the variables in an 
order different from the order chosen in the Macaulay 2 code described in Example [T71 
Together with the option dp, this variable ordering determines a monomial order which 
we found to be advantageous. With this particular order, a modern workstation requires 
less than 10 minutes to execute the remaining seven commands. Other monomial orders 
we considered led to significantly slower computations. 

The function eliminate carries out the elimination step of deriving Jg^ from Ms 3. 
The codimension and degree of the variety V(I%$) are equal to 7 and 1368, respectively, 
as computed in the line following the elimination. The function mstd permits to compute 
the list of polynomials I83min, which minimally generate the ideal Is, 3- According to 
the command betti (I83min) , this generating set consists of 14 polynomials of degree 
four, 260 polynomials of degree seven, and 168 polynomials of degree eight. They are: 

degree ^: Twelve polynomials are off-diagonal 4 x 4-minors. The other two polynomials 
are sums of off-diagonal minors and can be replaced by off-diagonal minors. 

degree 7: The 260 polynomials of degree seven come in two flavors. 
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(a) Of the 260 polynomials, 120 = (5) • 15 are sums of linear eliminants of the type 
described in Proposition [T3j These linear eliminants are equal to fi rc RC 
with R U C = R U C in [p] \ {i} and we call them septads. A septad has 168 
terms, and an example of a septad appearing in the minimal generator is 

P23pl3p46p68pl7p28p47+p67pl8pl3p24p28p36p47-pl3p24p68pl7p28p36p47- . . . 
. . .+pl2pl7p36p26p37p48~2-p23pl7pl6p26p37p48-2+pl3pl6p27p26p37p48~2 . 

While 102 of the 120 polynomials in our output are septads, the other twelve 
can be replaced by septads without affecting the minimal generator property. 

(b) The remaining 260 — 120 = 140 polynomials are of ideal-theoretic nature. 
They are not polynomial linear combinations of the off-diagonal minors and 
septads. However, the squares of the 140 polynomials are polynomial combi- 
nations of off-diagonal minors and septads. Hence these invariants vanish at 
a covariance matrix ^ whenever the off-diagonal minors and septads do. 

degree 8: These 168 minimal generators are also of ideal-theoretic nature. Their squares 
are polynomial linear combinations of off-diagonal minors and septads. 

Table [2] summarizes our knowledge about the composition of a minimal generating 
set of the factor analysis ideal / p>m for m < 3 and p < 9. This table suggests various 
conjectures about the ideals I p ^ m for general p, and we will discuss these in Section [71 
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Table 2: The degrees of the minimal generators of the ideals I p , 



6 Multilinear resultants 

Resultants are a technique for simultaneously eliminating m unknowns from a system 
of m + 1 polynomial equations. While Grobner bases can be used to perform this task, 
we must remember that Grobner bases are a very general method. They often compute 
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too much and are hence too inefficient. The applicability of resultants is more limited, 
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In our algebraic study of factor analysis, we found the Grobner basis computations 
of Section 5 to be infeasible for m > 4. Instead we did some computations using the 
multilinear resultant. In this section we explain this technique, and how it was used to 
derive the degrees 54 and 98 for (m,p) = (5,9) and (m,p) = (4,8) in Table HJ 

Consider a set of n + 1 multilinear polynomials fo, ■ ■ ■ , f n 

in Tt unknowns Xi, . . . , x n \ 



Si 



E • 

h,h>— ,i«e{o,l} 



U = o, i, 



(19) 



Here the coefficients a% ^ + 



are regarded as unknowns. The total number of these 
coefficients is 2 n • (n + 1), and they generate a polynomial ring which we denote by 



,i n G {0,1}, i G {0,...,n}]. 



We write M[a, x] for the polynomial ring generated by the coefficients a\ i • and the 
unknowns x\, . . . , x n , and we consider the ideal {fo, fx, ... , f n ) in E[a, x] which is gen- 
erated by the multilinear polynomials (|19p . We have the following result from algebra: 



Theorem 19. The elimination ideal {fo, f\, . . . , f n ) n R[a] is generated by an irreducible 
polynomial lZ(a) which is homogeneous of degree n\ in the coefficients of each fj. 



Proof. This follows from the results in lGel'fand et al.l (|1994l . Sect. 8. 2. A), applied to the 
special case when the toric variety Xa is the product of n projective lines in its Segre 
embedding. The corresponding polytope Q is the n-dimensional standard cube, which 
has normalized volume equal to n! . The polynomial 1Z(a) is the Chow form of Xa- Q 

We call TZ(a) the n-th multilinear resultant. Here are the first three cases: 

Example 20. If n = 1 then /o = Oq + a i x i an d fx = «o + a\x\. Their resultant equals 



W(a) 

Example 21. If n 
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(20) 



2 then we are considering a system of three bilinear equations 

fo = a oo + a io x i + a oi x 2 + a\ x x\Xi, 

fi = ao + a^xi + 0(3^2 + 0^x1x2, 
h = a oo + a w x i + a oi x 2 + anxix 2 - 
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Their resultant has the following determinantal representation: 
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(21) 



This polynomial has degree six but it is quadratic in the coefficients of each fj. 
Example 22. If n = 3 then the coefficients of /o, /i, /b, /3 form an 4 x 8-matrix 
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"no 


a 



' 

111 
1 

111 

2 

111 

3 

111. 
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Let [ijkl] denote the determinant of the 4 x 4-submatrix of A with columns i,j,k,l. 
Then the multilinear resultant TZ(a) is the determinant of the following 6 x 6-matrix: 



[0124] 


[0234] 


[0146] - 


[0245] 


[0346] - 


[0247] 


[0456] 


[0467] 


[0125] 
h[0134] 


[1234] 
+ [0235] 


[0147] + 
-[0345] - 


[0156] 
- [1245] 


-[1247] - 
-[0257] - 


h [0356] 
h [1346] 


[1456] 
+ [0457] 


[1467] 
+ [0567] 


[0135] 


[1235] 


[0157] - 


[1345] 


-[1257] - 


f- [1356] 


[1457] 


[1567] 


[0126] 


[0236] 


-[1246] - 


f [0256] 


[2346] - 


[0267] 


[2456] 


[2467] 


[0136] 
h[0127] 


[1236] 
+ [0237] 


-[1247] - 
+ [0257] - 


- [1346] 
f [0356] 


-[0367] - 
+ [2356] - 


- [1267] 
f- [2347] 


[3456] 
+ [2457] 


[2567] 
+ [3467] 


[0137] 


[1237] 


-[1347] - 


f- [0357] 


-[1367] - 


h [2357] 


[3457] 


[3567] 



(23) 



The derivation of this 6 x 6-matrix is explained in ISturm fels (2002J, Prop osition 4.10). 
The matrix (|23p differs from the 6 x 6-matrix displayed by Sturmfels ( 20021 ) because the 
latter had some typographical errors. These typos have now been corrected in (|23p . 

These formulas can be applied to algebraic factor analysis as follows. Recall from 
Theorem[7]that the ideal / p>m is computed by eliminating the diagonal unknowns ipn from 
the (m+1) x (m+l)-minors of an indeterminate covariance matrix = (i/Jij)- Multilinear 
resultants are relevant for this task because the minors are multilinear polynomials in 
the ipu. For instance, the derivation of the pentad constraint can be interpreted as 
an evaluation of the first multilinear resultant (|20p . Likewise, the second multilinear 
resultant (|2ip can be used to produce non-trivial invariants in I PtJn when p > 2, m > 8. 

The general method for computing invariants of the factor analysis model using 
resultants is the following. We choose any subset D = {d\, . . . , d n } of cardinality n in 
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\p\ and any subsets Rq, . . . , R n , Co, . . . , C n each of cardinality m + 1 — n in \p]\D such 
that Ri n Ci = for all %. This choice of D,R 9 ,C 9 gives rise to an invariant as follows. 

Theorem 23. Let xj = ip d]dj for j = l,...,n and f k = det(^ DR k xDC k ) for k = 
0, . . . , n. The evaluation of the n-th multilinear resultant at /o, /i, . . . , f n is an element 
of R[ipij, i < j]. This polynomial lies in the ideal I pm > an d it is either zero or it is 
homogeneous of degree 

n 

(m - - + l)-(n + l)! (24) 



Proof. The proof uses the methods of iGel'fand et al.1 (|l994h and is omitted here. □ 



It is instructive to examine this theorem for small values of n. If n = 1 (as in 
Example 120]) then the degree (|24[) equals 2m + 1 and we recover the linear eliminant 
fiRC RC °f Proposition [TUl Here D = {i}, Rq = R, R\ = R, Co = C and C\ = C. If 
n = 2 (as in Example 12 ip then the invariant constructed in Theorem 1231 is homogeneous 
of degree 6m. If n = 3 (as in Example [22]) then the invariant constructed in Theorem 
[23] is homogeneous of degree 24m — 12. In particular, the degree (]24p is 108 for m = 5. 

Example 24 (p = 9, m = 5 using resultants). This is the case appearing in the lower 
right corner of Table [H The variety V(Ig^) is a hypersurface of degree 54 in the space 
of symmetric 9 x 9-matrices. The irreducible polynomial defining this hypersurface is 
the greatest common divisor (gcd) of all multilinear resultants constructed for n = 3 as 
in Theorem 1231 Each resultant has degree 108, and their gcd was found to have degree 
54. In fact, we observed that it suffices to take the gcd of only two such resultants. 

For a concrete instance take D = {1,2,3}, Rq = {4,5,6}, Co = {7,8,9}, R\ = 
{4, 5, 7}, Ci = {6, 8, 9}, R 2 = {4, 6, 8}, C 2 = {5, 7, 9}, R 3 = {4, 7, 9}, C 3 = {5, 6, 8}, and 
let / be the degree 108 polynomial constructed by Theorem 1231 It is infeasible to express 
/ as a sum of monomials. (Note that the number of monomials of degree 108 in the 
36 unknowns ipij exceeds 10 33 ). However, the 6 x 6-determinant f)23f) offers an efficient 
representation of the invariant /. Namely, if the ipij are replaced by linear forms in one 
or two parameters, then this 6 x 6-determinant can be evaluated rapidly. From this we 
see that / is non-zero and that it factors as the product of four irreducible polynomials, 
having degrees 18, 18, 18 and 54. The last factor is the generator of Ig^. 

Let g denote the degree 54 invariant which generates the ideal Ig^. We have repre- 
sented g as the gcd of several polynomials of degree 108, but this representation specifies 
g only up to a multiplicative constant. Up to this constant, we can evaluate g numerically 
at a covariance matrix ^ by choosing a random matrix introducing an unknown, say 
t, computing the gcd of several resultants of the form /(*& + £• ^o) £ ^[t], and evaluating 
the resulting polynomials at t = 0. Repeating the computation at a second covariance 
matrix \E f/ with the same is an efficient scheme for evaluating the ratio g(^)/g(^'). 

Example 25 (p = 8, m = 4 using Macaulay 2 and resultants). The variety V{I^^ 
has codimension two in the space of symmetric 8 x 8-matrices. In order to compute its 
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degree, we intersect the projective variety of Ig^ with a general two-dimensional plane. 
We do this computation over a finite field, using the following Macaulay 2 commands: 



S = ZZ/101[r,s,t] ; 

R = ZZ/101 [pll,p22,p33,p44,p55,p66,p77,p88,r ) s,t,Monomial0rder=> Eliminate 8]; 
f = map(R,S,{r,s,t>) ; 



pl2 


= f (random (l,S)) 


pl3 


= f (random (1,S)) 


pl4 = 


f (random (1,S)) ; 


pl5 


= f (random (1,S)) 


pl6 


= f (random (l,S)) 


pl7 = 


f (random (1,S)) ; 


pl8 


= f (random (1,S)) 


p23 


= f (random (l,S)) 


p24 = 


f (random (1,S)) ; 


p25 


= f (random (1,S)) 


p26 


= f (random (l,S)) 


p27 = 


f (random (1,S)) ; 


p28 


= f (random (l,S)) 


p34 


= f (random (l,S)) 


p35 = 


f (random (l,S)) ; 


p36 


= f (random (l,S)) 


p37 


= f (random (1,S)) 


p38 = 


f (random (l,S)) ; 


p45 


= f (random (l,S)) 


p46 


= f (random (l,S)) 


p47 = 


f (random (l,S)) ; 


p48 


= f (random (l,S)) 


p56 


= f (random (l,S)) 


p57 = 


f (random (l,S)) ; 


p58 


= f (random (l,S)) 


p67 


= f (random (1,S)) 


p68 = 


f (random (l,S)) ; 










p78 = 


f (random (l,S)) ; 



Psi = matrix {{pll,pl2,pl3,pl4,pl5,pl6,pl7,pl8}, 
{pl2 , p22 , p23 , p24 , p25 , p26 , p27 , p28} , 
{pl3 , p23 , p33 , p34 , p35 , p36 , p37 , p38} , 
{pl4 , p24 , p34 , p44 , p45 , p46 , p47 , p48> , 
{pl5 , p25 , p35 , p45 , p55 , p56 , p57 , p58} , 
{pl6 , p26 , p36 , p46 , p56 , p66 , p67 , p68} , 
{pl7 , p27 , p37 , p47 , p57 , p67 , p77 , p78} , 
{pl8 , p28 , p38 , p48 , p58 , p68 , p78 , p88}} 

G = gens gb minors (5, Psi) ; J = ideal selectlnSubringd ,G) ; 

codim J, degree J 



The output of repeated runs verifies that the degree of V^s^) equals 98. However, 
this computation gives no information about the minimal generators of 1%^ and at 
present we do not even know the smallest degree of a non-zero polynomial in /s,4- 

On the other hand, we obtain a large number of non-trivial invariants of degree 24 
by applying Theorem [23] with n = 2. Namely, for any choice of indices D,R 9 ,C, we set 



/ Xl 






tduca 




Ci2 ^ 




X2 


' l Pd2,c i0 




Ipd! 


Ci2 










Vv i0 


Ci2 










A* 


Ci2 




1 Pr i 2,d2 






^r l2 


Ci2/ 



for i = 0,1,2. 



Here, Ri = {rjo, rn, rj2} and Ci = {qo, Qi, c^}- The invariant of degree 24 is obtained 
evaluating the formula (I2ip . in which we abbreviate the coefficients of fi by a l - k = a l j k (ijj). 
Just as in Example [Ml it is impossible to write this invariant as a sum of monomials, but 
it is very easy to evaluate it numerically using the determinantal representation (|2ip . 
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7 Conjectures about generators of the ideals 7 p TO 

In Sections 4-6 we discussed the problem of computing a finite generating set for the 
ideal I pm of invariants of the factor analysis model F p . m . Our computational results 
suggest some natural conjectures and problems about the structure of this generating 
set, and we believe that these will be of independent interest to commutative algebraists. 

The pattern we found for small m, and that we hope is true for larger m, is that as p 
gets large the generators of I Ptm depend on only a certain fixed number of random vari- 
abl es. This type of finit e ness prop erty has frequent occurrences in algebraic statistics. 



See Allman and Rhodes ( 2004! ) and Santos and Sturmfeld ^200$ ) for two instances 



The prototypical conjecture of this type is the following. 

Conjecture 26. The ideal of the two-factor model, I p ^, is minimally generated by 5(g) 
off-diagonal 3 x 3-minors and (|) pentads. 

Conjecture [26] is supported by the numerical evidence compiled in Table[2j Note that 
all of the minors and pentads described involve at most six random variables. However, 
unlike in the case of the 1-factor model, there does not seem to be any term order that 
makes the collection of off-diagonal minors and pentads a Grobner basis for I p ^. 

The most natural term order we discovered in our computations has already been 
introduced in Examplell8l In general, this is the lexicographic term order with ipij >- ip^i 
if the circular distance between i and j is smaller than the circular distance between k 
and I. (To compute the circular distance between i and j, place the numbers 1, . . . ,p 
equispaced around a circle and measure the distance by taking the shortest path around 
the circle between i and j.) If these circular distances are the same, we declare ipij >~ ipki 
if i < k. We call this term order the circular lexicographic term order. The fact that 
each of the ideals I Pi m is an m -th secant ideal, together with the machinery developed in 
Sturmfels and Sullivantl (j2005h . and our computations led us to the following conjecture. 



Conjecture 27. The circular lexicographic term order is 2- delightful for I Py \. More 
specifically, the reduced Grobner for I P} 2 consists of certain explicitly constructed poly- 
nomials of odd degree less than p with squarefree initial terms. 



The notion of a delightful term order was developed in lSturmfels and Sullivantl (120051 ) 
and the technical details are beyond the scope of this short section. However, the basic 
idea is that, if Conjecture [27] is true, information about the initial ideal and reduced 
Grobner basis of the two-factor ideal I p p can be deduced from the reduced Grobner 
basis of the one-factor ideal I p .-\ using graph theory. We refer the interested reader to 



Sturmfels and Sullivantl l|2005l ) for information about delightful term orders. 



Moving to three factors, the situation seems even more complicated. As we have 
already seen in Table (2] the minors and septads are not enough to generate the entire 
ideal I p $. The minors and septads do, however, determine the parameter space F p $ 
set-theoretically in all the examples we were able to compute. This leads us to suspect: 
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Conjecture 28. Let J p ^ be the ideal generated by all the 4x4 off-diagonal minors and 
all the septad linear eliminants. Then the radical of J p ^ is the prime ideal I p ^. 

We do not have any concrete conjectures about the generators or Grobner bases of 
I Pt 3. Note that each of the minors and septads involve at most eight random variables. 
Is it possible that this type of finiteness behavior continues for larger m? For a subset 
A C \p] denote by Ia,™ the ideal I\A\,m with indices labeled by the elements of A. 

Question 29. For each integer m > 1 does there exist another integer s(m) such that 
I p ,m = ^ lA < m for a11 P > s ( m ) ? 

A<z\p],\A\=s(m) 

Does there exist a different number t(m) where the equality holds up to radical? 

Our computations aside, there is some theoretical evidence to suggest that the set- 
theoretic finiteness result might hold with t{m) = 2m + 2. Namely, we can show that 
the set-theoretic finiteness result does hold in the complement of a certain hypersurface. 

Proposition 30. Let ^ be a symmetric p x p-matrix with p > 2m + 2 and suppose that 
no m x m-minor of ^ is zero. Then ^ £ F p ^ m if and only if *$>a,A £ -^2m+2,m f or a H 
subsets A C \p] of cardinality \A\ = 2m + 2. 

Proof. The "only if" direction is trivial. To prove the "if" direction we first need to 
refer to two simple results about factor analysis models. First of all, if ^ G F P)Tn and 
has no m x m minor equal to zero, then the decomposition \& = £ + V with S diagonal 
and rank(r) = m is unique. Furthermore, if T is a symmetric rank m matrix and A and 
K are p x m matrices with Y = A A* = KK l t hen t here exists an orthogonal matrix Q 
such that A = KQ. See Anderson and Rubin ( 19561 ) for both of these results. 



Now we prove the "if" direction by induction on p. The induction base is p = 2m + 2 
in which case the statement is vacuous. Suppose that \I> satisfies \& a,a £ -^2m+2,m for 
all subsets A C [p\ with |A| = 2m + 2. Denote by ^ + the submatrix ^[p-i] [p-il? by \E'_ 
the submatrix ^[p]\{i},[p]\{i} an d by $1 the submatrix ^[ p ]\{i, p } t [ p ]\{i tP }- In other words, 

is the upper left (p — 1) X (p — 1) submatrix, ^_ is the lower right (p — 1) X (p — 1) 
submatrix, and is the (p — 2) x (p — 2) submatrix where ^ + and 1 I / _ overlap. 

By the induction hypothesis, *$> + and belong to -F p _i jm , and so there exist unique 
S + , diagonal and T + , T_ of rank m such that ^ + = S + + T + and = £_ + T_. 
Furthermore, these provide a unique representation for the overlap \P_ = Sl+TZ, where 
SZ and rz are the common overlapping portions of S + and £_ and, respectively, T + and 
r_. Let r + = AA* and T_ = KK l be rank m factorizations of A + and A_. Because of 



the un iqueness in the overlap A_, and the second result taken from I Anderson and Rubin 



(jl956h mentioned above, we can assume that the last p — 2 rows of A coincide with the 



first p — 2 rows of K. Now form the p x p matrices 

S f S U s ° \ and f = AA* where A = ( ^ 
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Set \P = S + f . We claim that ^ = ^ and hence \P G -Fp,m ; which completes the 
proof. Note that trivially rpij = ipij except possibly for the pair = (l,p). So we 
must show that ip± p = ip± p . Let B and C be disjoint subsets of of cardinality 

\B\ = \C\ = m. Then the following three (m + 1) x (m + 1) minors are equal to zero: 

*ixc _4>i P 

^BxC ^Bxp 

The first and last minors are zero because ^> AxA and ^> AxA belong to F2 m +2,p for any A 
with \ A\ = 2m+2 by assumption. The middle minor is zero because it is entry-wise equal 
to the first minor. Since by assumption det(^ bxc) / Owe deduce that ip\ p = ipip. □ 

The proof technique we present does not allow the extra condition in Proposition [30] 
to be dropped and thus we are a long way from answering Question 1291 



^BxC ^Bxp 



^lxC Ipip 
^BxC ^Bxp 



8 Computer algebra for Gaussian models: the next steps 

The research presented in this paper merely scratches the surface of possible applica- 
tions of computer algebra techniques for studying the factor analysis model and more 
general models for Gaussian random variables. In this final section, we highlight two 
such problems: maximum likelihood (ML) estimation and the study of singularities. An- 
other important direction is the design of test statistics from higher invariants (pentads, 
septads, etc.) by computing moments of the Wishart distribution (cf. Section [3|). 

Since much statistical inference in factor analysis is currently based on ML estima- 
tion, it seems natural to ask what computational algebra has to say about computing 
ML estimators for factor analysis. As a starting point, we computed the maximum 
likelihood degree for th e one- factor model with four observed variables. The ML degree 
( Catanese et al. . 20051 ) of a statistical model is the number of nontrivial complex zeros 
of the critical equations for generic data. We found that the factor analysis model F^i 
has ML degree 57. In what follows, we describe how we discovered the number 57 to be 
the number of complex zeros of the critical equations for ML estimation in this model. 

Consider a matrix ^ £ F Ptm with decomposition ^ = £ + T, where £ is diagonal 
with positive entries and T is positive semidefinite with rank(r) < m. Then we have 

= E^-^rxr 1 . 

The matrix x I /_1 rS _1 is symmetric of rank < m. Moreover, the fact that ^ — S = T is 
positive semidefinite implies that — = ^I'~ 1 rS~ 1 is also positive semidefinite. 
Hence, the inverse of \E r G F P)m can be written as 

= T - KK\ 
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where T is diagonal with positive entries and K 6 R p> 
m = 1, this amounts to writing in terms of r = (n, . 



T, « 



diag(r) 



UK 



. In the case of p = 4 and 
, T4) and re = (ki, . . . , K4) as 

(25) 



As in Section [31 let X and S be the sample mean vector and the sample covariance 
matrix computed from a sample of N random vectors. For ML estimation it is more 
convenient to work with the matrix S = (N — 1)/N ■ S instead of S. The model of 
multivariate normal distributions M(fi, VP) has the log-likelihood function 



N N ~ , N - 

= — logdetpf) - y tr^" 1 ) - -{X 



(26) 



Mardia et al. (|l979l . Sect. 4.1.1). This function is maximized in [i by setting 



compare 

fj, = X, which leads to the vanishing of the quadratic form appearing as third term in 
(|26p . Therefore, we can find the maximizer in by maximizing the expression 



log det(^ 



trOSf- 1 ). (27) 

Here ^ runs over F^i. By plugging (|25p into (|27p . we can write this expression as a 
function of the eight unknowns rj, T2, T3, T4, «i, «;2> K 3, K 4- Taking partial derivatives and 
setting them to zero, we obtain a system of eight equations in eight unknowns. These 
are the likelihood equations of the factor analysis model F4 1 in rational function form: 



1 



adet(^- 1 (r, k)) 
det(*- 1 (r, k)) ifhi 

1 9det(^- 1 (r,K)) 



det(^- 1 (r, k)) 



tr 



tr 



5 



S* -1 ^, re) 



5re, ; 



1 4, 



These equations can be made polynomial by multiplying through by det( l I /_1 (r, re)). 
Clearing the denominator introduces many additional solutions to the system, namely 
noninvertible matrices of the form \& _1 (r, re). However, these extraneous solutions can 
be removed using an operation called saturation. After saturation, we discover that 
the solution set of the polynomial likelihood equations consists of 57 isolated (complex) 
points. Clearly, these 57 solutions come in pairs (r, ±re); one solution has re = 0. Further 
work needs to be done to determine how many of these can be statistically meaningful 
local maxima and to extend these results to larger models. 

The second problem we wish to illustrate is that of singularities. The singularities 
of statistical models play an important, though under-appreciated, role. Models with 
singularities do not form curved exponential families, which invalidates, for example, the 
theor etical basis of model selection using information criteria like BIC (Geiger et all 



l200lh . Near a singularity, such criteria require correction terms (jWatanabd . 
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This is especially important in factor analysis because, among other singularities, each 
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parameter space F p>m is singular along F Pim -i making the selection of the number of 
factors difficult. The fact that _F P)jn _i is contained in the singular locus of F p ^ m can be 
proved by observing that V{I Pjin ) is the m-th secant variety of V(I P) i) and by appealing 
to general results in algebraic geometry about singularities of secant varieties. 

As a first step towards a better understanding of singularities, we computed the 
singular loci of some of the small factor analysis models. The computation of the singular 
locus is done, in Macaulay 2 or Singular, by augmenting I Pjm by the c x c-minors of 
the Jacobian matrix of any generating set of I p>m , where c = codim(/ Pjm ). 

Example 31. For p = 4 and m = 1, the singular locus consists of all matrices which 
have at most one non-zero off-diagonal entry. Thus, the singular locus consists of one 
symmetry class of covariance matrices. For instance, one type of matrix in the singular 
locus has the form 



Mi 


•012 





°\ 




022 














033 





V o 








■044/ 



Example [31] generalizes to an arbitrary number of observed random variables when 
the number of factors is fixed at m = 1. 

Proposition 32. A matrix ^ G Mr(^p,i) is a singularity of the one factor model if and 
only if \E f has at most one non-zero off diagonal entry. 

Proof. This can be seen by computing derivatives of the tetrads and by using results in 
toric geometry. We omit the details. □ 

For more factors, the situation is considerably more complicated. 

Example 33. Let p = 5 and m = 2. In this case is the pentad hypersurface. 
The singular locus of this hypersurface has dimension 11, and consists of two symmetry 
classes of singularities. A representative from the first symmetry class is the set of 
matrices of the form 














o \ 







023 


0>24 


025 





1p23 


^33 


0^34 


035 





•024 


^34 


044 


045 


V o 


025 


V>35 


045 


055/ 



The second symmetry class is more complicated. A representative set consists of those 
matrices ^ that satisfy all tetrads not involving ipi2- Note that this set of singular 
points contains -^5,1- In total there are five elements of the first symmetry class and ten 
elements in the second. To apply algebraic geometry techniques to these singularities for 
model selection requires a careful analysis of the way the various singular sets intersect. 
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It is an open problem to determine the singular locus of the factor analysis models 
F Pt7n in general. Even the dimension of that singular locus is unknown to us. 

Finally we stress that while the algebraic statistical study in this paper was confined 
to factor analysis models, problems analogous to the ones described here appear in other 
classes of Gaussian graphical models. The study of such other models, which need not 
involve hidden variables, opens up a broad range of directions for future research. 
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